
[PRIM][IR]Complete IR vjp code gen for more vjp code gen #56798

Merged

Conversation


@Charles-hit Charles-hit commented Aug 30, 2023

PR types

New features

PR changes

Others

Description

Pcard-66975
Background:
Building on PR56512, this PR scales up vjp generation for backward ops, prioritizing vjp support for the 26 high-priority ops that GPT/LLama depend on.
Changes in this PR:

  • Improve operator vjp code generation under the new IR, with support for parsing compat yaml information
  • Support mutable attributes in vjp
    Under the new IR, a mutable attribute becomes an input. However, when building a composite op, any attribute whose value must be inspected has to be a compile-time constant. The vjp interface therefore takes mutable attributes uniformly in input form and converts internally depending on the mode: in composite mode, the real attribute value is recovered from the defining op of the input; in non-composite mode, the input is simply passed through.
    Taking the sum op as an example, the generated vjp code is:
std::vector<std::vector<paddle::Tensor>> sum_vjp(const Tensor& x, const Tensor& out_grad, const Tensor& axis_, bool keepdim, bool reduce_all, const std::vector<std::vector<bool>>& stop_gradients) {
  std::vector<std::vector<paddle::Tensor>> vjp_res;
  for (auto arg: stop_gradients) {
    vjp_res.push_back(std::vector<paddle::Tensor>(arg.size()));
  }
  if (paddle::prim::StaticCompositeContext::Instance().IsBwdPrimEnabled()) {
    paddle::Tensor* x_grad = !stop_gradients[0][0] ? &vjp_res[0][0] : nullptr; 
    auto* axis_define_op = std::static_pointer_cast<primitive::LazyTensor>(axis_.impl())->getValue().dyn_cast<ir::OpResult>().GetDefiningOp();
    if (axis_define_op->name() != "pd.full_int_array") {
      PADDLE_THROW(platform::errors::Unimplemented(
          "We don't support dynamic tensors attribute axis for sum_grad composite "
          "for now. "));
    }
    auto axis = axis_define_op->attribute("value").dyn_cast<paddle::dialect::IntArrayAttribute>().data();

    details::sum_grad<LazyTensor>(x, out_grad, axis, keepdim, reduce_all, x_grad);
  } else {
    auto op_res = backend::sum_grad<LazyTensor>(x, out_grad, axis_, keepdim, reduce_all);
    vjp_res[0][0] = !stop_gradients[0][0] ? op_res : vjp_res[0][0];
  }
  return vjp_res;
}

In the differentiation layer, the sum network-building API is generated in two forms, taking the mutable attribute either as an input or as a plain attribute:

Tensor sum<LazyTensor>(const Tensor& x, const Tensor& axis_, DataType dtype, bool keepdim) {
  ir::OpResult x_res = std::static_pointer_cast<LazyTensor>(x.impl())->getValue().dyn_cast<ir::OpResult>();
  ir::OpResult axis_res = std::static_pointer_cast<LazyTensor>(axis_.impl())->getValue().dyn_cast<ir::OpResult>();
  auto op_res = paddle::dialect::sum(x_res, axis_res, dtype, keepdim);
  Tensor out(std::make_shared<LazyTensor>(op_res));
  return out;
}
Tensor sum<LazyTensor>(const Tensor& x, const IntArray& axis, DataType dtype, bool keepdim) {
  ir::OpResult x_res = std::static_pointer_cast<LazyTensor>(x.impl())->getValue().dyn_cast<ir::OpResult>();
  auto op_res = paddle::dialect::sum(x_res, axis.GetData(), dtype, keepdim);
  Tensor out(std::make_shared<LazyTensor>(op_res));
  return out;
}
  • Adapt intermediate outputs so they are not returned from the network-building API
    Taking reshape as an example, xshape is an intermediate output and will not be returned:
Tensor reshape<LazyTensor>(const Tensor& x, const Tensor& shape_) {
  ir::OpResult x_res = std::static_pointer_cast<LazyTensor>(x.impl())->getValue().dyn_cast<ir::OpResult>();
  ir::OpResult shape_res = std::static_pointer_cast<LazyTensor>(shape_.impl())->getValue().dyn_cast<ir::OpResult>();
  auto op_res = paddle::dialect::reshape(x_res, shape_res);
  Tensor out(std::make_shared<LazyTensor>(op_res));
  return out;
}
  • Adapt the network-building APIs used in vjp to multi-input/multi-output ops
    Multi-input, multi-output example:
std::vector<Tensor> concat_grad<LazyTensor>(const std::vector<Tensor>& x, const Tensor& out_grad, const Scalar& axis) {
  std::vector<ir::OpResult> x_res(x.size());
  std::transform(x.begin(), x.end(), x_res.begin(), [](const Tensor& t) {
    return std::static_pointer_cast<LazyTensor>(t.impl())->getValue().dyn_cast<ir::OpResult>();
  });
  ir::OpResult out_grad_res = std::static_pointer_cast<LazyTensor>(out_grad.impl())->getValue().dyn_cast<ir::OpResult>();
  auto op_res = paddle::dialect::concat_grad(x_res, out_grad_res, axis.to<int>());
  std::vector<Tensor> x_grad(op_res.size());
  std::transform(op_res.begin(), op_res.end(), x_grad.begin(), [](const ir::OpResult& res) {
    return Tensor(std::make_shared<LazyTensor>(res));
  });
  return x_grad;
}

Remaining work:

  • Generate vjp for the full set of high-priority ops that GPT/LLama depend on
  • Support optional inputs and a semantic representation of empty backward gradients

@paddle-bot

paddle-bot bot commented Aug 30, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@Charles-hit Charles-hit force-pushed the support_mutable_attributes branch 4 times, most recently from d5f618b to 8bb28a3 on August 31, 2023 01:35
def is_mutable_attribute(attr):
    return (
        attr['typename'] in ['Scalar', 'IntArray']
        and attr['support_tensor'] is True
    )
Contributor

How are attributes that fall into these two types ('Scalar', 'IntArray') but lack the support_tensor property handled?

Contributor Author

They are treated as compile-time constants.
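
A minimal sketch of that split, for illustration (the helper name classify_attr and the returned tags are invented here, not taken from the PR's gen.py):

def classify_attr(attr):
    # Scalar/IntArray attributes marked with support_tensor become
    # tensor inputs under the new IR; every other attribute stays a
    # plain, compile-time-constant attribute of the op.
    if is_mutable_attribute(attr):
        return 'tensor_input'
    return 'constant_attribute'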

@Charles-hit Charles-hit force-pushed the support_mutable_attributes branch 2 times, most recently from 8cc333a to 033acf4 on August 31, 2023 16:34
@cxxly

cxxly commented Sep 1, 2023

Use a meaningful PR description, e.g. "This PR extends the codegen logic to support the XX high-priority ops that GPT/LLama depend on."
In the detailed description, explain which specific features were improved.

);
{% elif outputs|length == 1 %}
return Tensor(std::make_shared<LazyTensor>(op_res));
return std::make_tuple({% for i in range(outputs|length) %}{{outputs[i].name}}{%- if i!=outputs|length - 1 -%}, {% endif %}{% endfor %});
Contributor

Split this into multiple macro functions. As a rule of thumb, keep each function under 50 lines and each module within one screen.

Contributor Author

This will be cleaned up uniformly in the next PR; we are merging a first version now so as not to block others' work.

{% endif %}
{% endif %}
{% endfor %}
{% endmacro %}
Contributor

Compiler constant inference (constant folding) is a fairly common technique; it could be encapsulated in a separate, readable function to make that part of the code's purpose clear.
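
As a sketch of this suggestion at the codegen level (the helper name render_constant_fold and the template string are invented for illustration; they are not the PR's actual code), the folding block seen inline in sum_vjp above could be emitted from a single well-named helper:

# Emits the C++ block that folds a mutable IntArray attribute back to a
# constant. Naming the helper documents the intent ("fold the attribute
# to a constant") and keeps the generated vjp bodies short.
FOLD_INT_ARRAY_TEMPLATE = """\
auto* {name}_define_op = std::static_pointer_cast<primitive::LazyTensor>(
    {name}_.impl())->getValue().dyn_cast<ir::OpResult>().GetDefiningOp();
if ({name}_define_op->name() != "pd.full_int_array") {{
  PADDLE_THROW(platform::errors::Unimplemented(
      "We don't support dynamic tensors attribute {name} for {op} composite for now."));
}}
auto {name} = {name}_define_op->attribute("value")
    .dyn_cast<paddle::dialect::IntArrayAttribute>().data();
"""

def render_constant_fold(attr_name, op_name):
    return FOLD_INT_ARRAY_TEMPLATE.format(name=attr_name, op=op_name)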

Contributor

+1

Contributor Author

This will be fixed uniformly in the next PR.

@Charles-hit Charles-hit changed the title from [PRIM][IR]Complete IR vjp code gen to [PRIM][IR]Complete IR vjp code gen for more vjp code gen on Sep 1, 2023
Contributor

@cxxly cxxly left a comment

LGTM

Contributor

@Aurelius84 Aurelius84 left a comment

LGTM

{% endif %}
{% endif %}
{% endfor %}
{% endmacro %}
Contributor

+1

Contributor

@heavyrain-lzy heavyrain-lzy left a comment

LGTM for tests_utils.py

for attr in attrs:
    if (
        attr['typename'] in ['Scalar', 'IntArray']
        and attr['support_tensor'] is True
    ):
Contributor

  1. The typename for Scalar is not only 'Scalar'; it may also be 'Scalar(int)', 'Scalar(int64_t)', etc. This function could use the following helpers from tests_utils.py
def is_scalar(s):
    return re.match(r"Scalar(\(\w+\))*", s) is not None


def is_intarray(s):
    return s == 'IntArray'

to perform the check (see the sketch after this list).
2. Under the new IR, does a mutable attribute also need to be checked with the condition below?

attr['tensor_name'] is not None or attr['tensors_name'] is not None
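
A sketch of what point 1 would look like, reusing the two helpers above (illustrative only; see the author's reply below):

def is_mutable_attribute(attr):
    # Accept any Scalar variant (Scalar, Scalar(int), Scalar(int64_t),
    # ...) or IntArray, combined with the support_tensor flag.
    return (
        is_scalar(attr['typename']) or is_intarray(attr['typename'])
    ) and attr['support_tensor'] is True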

Contributor Author

@Charles-hit Charles-hit Sep 1, 2023

Thanks for the reminder. On the first point, my understanding is that the parsed data types are already explicit, so no change should be needed; the second point is already handled in gen.py.

@Charles-hit Charles-hit merged commit 4abea95 into PaddlePaddle:develop Sep 1, 2023
BeingGod pushed a commit to BeingGod/Paddle that referenced this pull request Sep 9, 2023
…e#56798)

* Fix attr type error like concat axis

* Fix None input error

* Fix intermediate output

* support vjp code gen

---------

Co-authored-by: 0x45f <wangzhen45@baidu.com>